The main goal of this notebook is to evaluate the usefulness of seqICP when applied on simulations of a simple VAR(1) process.
First, lets run the example for the package documentation so as to check everything is fine.
Let \(e \in \{a,b\}\) be two different environments governed by the following SCM:
\[ x^a_{1,t} = N_a(0.3, 0.09) \\ x^a_{3,t} = x^a_{1,t} + N_a(0.2, 0.04) \\ y^a_{t} = -0.7x^a_{1,t} + 0.6x^a_{3,t} + N_a(0.1, 0.01) \\ x^a_{2,t} = -0.5*y^a_{t} + 0.5*x^a_{3,t} + N_a(0.1, 0.01) \]
\[ x^b_{1,t} = N_b(0.3, 0.09) \\ x^b_{3,t} = N_b(0.5, 0.25) \\ y^b_{t} = -0.7x^b_{1,t} + 0.6x^b_{3,t} + N_b(0.1, 0.01) \\ x^a_{2,t} = -0.5*y^a_{t} + 0.5*x^a_{3,t} + N_a(0.1, 0.01) \]
Thus, a change has occurred on the distribution of \(x^b_{3,t}\) between environments, whereas the SCM of \(y_t\) remained constant. More precisely, we have:
\[ x^a_{3,t} \sim N(x^a_{1,t} + 0.2, 0.04) \\ x^b_{3,t} \sim N_b(0.5, 0.25) \\ \]
and
\[ y^e_{t} = -0.7x^e_{1,t} + 0.6x^e_{3,t} + N_e(0.1, 0.01) \\ \] regardless of \(e\). Lets sample the above process:
set.seed(1)
# environment 1
na <- 140
X1a <- 0.3*rnorm(na)
X3a <- X1a + 0.2*rnorm(na)
Ya <- -.7*X1a + .6*X3a + 0.1*rnorm(na)
X2a <- -0.5*Ya + 0.5*X3a + 0.1*rnorm(na)
# environment 2
nb <- 80
X1b <- 0.3*rnorm(nb)
X3b <- 0.5*rnorm(nb)
Yb <- -.7*X1b + .6*X3b + 0.1*rnorm(nb)
X2b <- -0.5*Yb + 0.5*X3b + 0.1*rnorm(nb)
# combine environments
X1 <- c(X1a,X1b)
X2 <- c(X2a,X2b)
X3 <- c(X3a,X3b)
Y <- c(Ya,Yb)
Xmatrix <- cbind(X1, X2, X3)
svar_sim <- cbind(Y, Xmatrix)
Here are the autocorrelation functions
for (i in 1:dim(svar_sim)[2]){
print(paste0("x", i))
ggtsdisplay(svar_sim[, i])
}
## [1] "x1"
## [1] "x2"
## [1] "x3"
## [1] "x4"
and the autocorrelation of the square of the series
for (i in 1:dim(svar_sim)[2]){
print(paste0("x", i))
ggtsdisplay(svar_sim[, i]^2)
}
## [1] "x1"
## [1] "x2"
## [1] "x3"
## [1] "x4"
We can note a few interesting observations from the DGP definition and the ACF/PACF.First, from the DGP definition we can conclude that there are only contemporaneous dependencies in the time series. Despite this fact, from a quick inspection of the ACF and PACF’s of the raw time series it seems that the shift on \(x_{3,t}\)’s distribution has introduced some time dependencies on \(y_t\) and itself. For instance, lags two and five for the ACF of \(y_t\) and lag five for the ACF of \(x_{3,t}\) are significant. Third, we can see that shift in \(x_{3,t}\) distribution translates into a shift in \(y_t\)’s variance. From a a visual inspection of the time series of the squares of \(y_t\) and \(x_{3,t}\) we can see that there is a big shift in its variance starting from timestep 130. Furthermore, this shift has introduced arch effects into the time series, which translates into heteroskedasticity of the processes.
From the above observations we can make the following statements about the target \(y_t\):
- It is non-stationary
- Its DGP is invariant across enviroments
- The main shift is in its variance
- There are only comtemporaneous dependencies in the DGP
We now run seqICP on the simulations:
seqICP.result <- seqICP(X = Xmatrix,
Y = Y,
stopIfEmpty=FALSE,
silent=FALSE)
## Currently fitting set S = {}
## p-value: 0.02
## Currently fitting set S = {1}
## p-value: 0.02
## Currently fitting set S = {2}
## p-value: 0.02
## Currently fitting set S = {3}
## p-value: 0.02
## Currently fitting set S = {1, 2}
## p-value: 0.02
## Currently fitting set S = {1, 3}
## p-value: 0.91
## Currently fitting set S = {2, 3}
## p-value: 0.02
## Currently fitting set S = {1, 2, 3}
## p-value: 0.22
summary(seqICP.result)
##
## Invariant Linear Causal Regression at level 0.05
## Variables X1, X3 show a significant causal effect
##
## coefficient lower bound upper bound p-value
## intercept 0.0 -0.05900 0.0179 NA
## X1 -0.7 -0.75200 -0.5292 0.02 *
## X2 0.0 0.00000 0.0000 0.91
## X3 0.6 0.57000 0.7228 0.02 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
seqICP.result <- seqICP(X = Xmatrix,
Y = Y,
model = "ar",
stopIfEmpty=FALSE,
silent=FALSE)
## Currently fitting set S = {}
## Warning in seqICP.s(X, Y, S[[1]], test, par.test, model, par.model): grid
## points larger than 215 are removed, to account for lag p=5
## p-value: 0.02
## Currently fitting set S = {1}
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 215 are removed, to account for lag p=5
## p-value: 0.02
## Currently fitting set S = {2}
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 215 are removed, to account for lag p=5
## p-value: 0.02
## Currently fitting set S = {3}
## p-value: 0.02
## Currently fitting set S = {1, 2}
## p-value: 0.02
## Currently fitting set S = {1, 3}
## p-value: 0.97
## Currently fitting set S = {2, 3}
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 218 are removed, to account for lag p=2
## p-value: 0.08
## Currently fitting set S = {1, 2, 3}
## p-value: 0.2
summary(seqICP.result)
##
## Invariant Linear Causal Regression at level 0.05
## Variable X3[t] shows a significant causal effect
##
## coefficient lower bound upper bound p-value
## intercept -0.01 -0.0710 0.0179 NA
## X1[t] 0.00 0.0000 0.0000 0.079 .
## X2[t] 0.00 0.0000 0.0000 0.970
## X3[t] 0.44 0.5700 0.7817 0.020 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Confidence intervals are only approximate due to post selection (pknown is FALSE).
From the above results we can see that seqICP was able to find the correct set of causal parents even if we didn’t know that no lags were present on the original DGP.
Now we proceed by testing seqICP on a multivariate time series process that has lag dependencies across variables but no clear shift in distribution. Consider the following DGP:
\[ x_{1,t} = 0.2x_{1,t-1} + 0.7x_{3,t-1} + N(0, 1)\\ x_{2,t} = 0.2x_{2,t-1} + N(0, 1)\\\ x_{3,t} = 0.7x_{1,t-1} + 0.2x_{2,t-1} + N(0, 1)\\\ \]
## VAR.sim: simulate VAR as in Enders 2004, p 268
B1 <- matrix(c(0.2, 0, 0.7, 0, 0.2, 0, 0.7, 0.2, 0), nrow = 3, ncol = 3, byrow = TRUE)
var_sim <- VAR.sim(B=B1, n=na+nb, include="none")
Here are the autocorrelation functions
for (i in 1:dim(var_sim)[2]){
print(paste0("x", i))
ggtsdisplay(var_sim[, i])
}
## [1] "x1"
## [1] "x2"
## [1] "x3"
and the autocorrelation of the square of the series
for (i in 1:dim(var_sim)[2]){
print(paste0("x", i))
ggtsdisplay(var_sim[, i]^2)
}
## [1] "x1"
## [1] "x2"
## [1] "x3"
From a visual inspection of the ACFs and PACFs of the raw and squares of
the time series we can notice two main differences. First, all raw time
series have a much stronger time dependence which is induced by the lags
of the time series. Furthermore, there is no clear shift in mean and/or
variance of the series. Therefore we can conclude that the process is
jointly stationary.
var_sim_df <- var_sim %>% as.data.table()
colnames(var_sim_df) <- c("x1", "x2", "x3")
varnames <- colnames(var_sim_df)
Ss <- list()
Sdetails <- list()
for (vn in varnames){
X <- var_sim_df %>% dplyr::select(-one_of(vn))
y <- var_sim_df %>% dplyr::select(one_of(vn))
output <- seqICP(X=X, Y=y, model = "ar",stopIfEmpty = FALSE, silent = TRUE)
Sdetails[[vn]] <- output
print(paste0("Target: ", vn, " Features: ", paste(colnames(X), collapse = " ")))
summary(output)
}
## Warning in seqICP.s(X, Y, S[[1]], test, par.test, model, par.model): grid
## points larger than 219 are removed, to account for lag p=1
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 219 are removed, to account for lag p=1
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 219 are removed, to account for lag p=1
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 219 are removed, to account for lag p=1
## [1] "Target: x1 Features: x2 x3"
##
## Invariant Linear Causal Regression at level 0.05
## No variable shows a significant causal effect
##
## coefficient lower bound upper bound p-value
## intercept -0.03 -0.1878 0.0984 NA
## X1[t] 0.00 0.0000 0.0000 0.95
## X2[t] 0.00 0.0000 0.0000 0.93
## Y0[t-1] 0.20 -0.1345 0.2936 NA
## X1[t-1] 0.00 -0.1141 0.2947 NA
## X2[t-1] 0.67 -0.1636 0.7667 NA
##
## Confidence intervals are only approximate due to post selection (pknown is FALSE).
## Warning in seqICP.s(X, Y, S[[1]], test, par.test, model, par.model): grid
## points larger than 219 are removed, to account for lag p=1
## Warning in seqICP.s(X, Y, S[[1]], test, par.test, model, par.model): grid
## points larger than 219 are removed, to account for lag p=1
## Warning in seqICP.s(X, Y, S[[1]], test, par.test, model, par.model): grid
## points larger than 219 are removed, to account for lag p=1
## Warning in seqICP.s(X, Y, S[[1]], test, par.test, model, par.model): grid
## points larger than 219 are removed, to account for lag p=1
## [1] "Target: x2 Features: x1 x3"
##
## Invariant Linear Causal Regression at level 0.05
## No variable shows a significant causal effect
##
## coefficient lower bound upper bound p-value
## intercept 0.05 -0.0950 0.205 NA
## X1[t] 0.00 0.0000 0.000 1
## X2[t] 0.00 0.0000 0.000 1
## Y0[t-1] 0.10 -0.1738 0.231 NA
## X1[t-1] 0.05 -0.1732 0.245 NA
## X2[t-1] -0.07 -0.1886 0.244 NA
##
## Confidence intervals are only approximate due to post selection (pknown is FALSE).
## Warning in seqICP.s(X, Y, S[[1]], test, par.test, model, par.model): grid
## points larger than 212 are removed, to account for lag p=8
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 214 are removed, to account for lag p=6
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 212 are removed, to account for lag p=8
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 214 are removed, to account for lag p=6
## [1] "Target: x3 Features: x1 x2"
##
## Invariant Linear Causal Regression at level 0.05
## No variable shows a significant causal effect
##
## coefficient lower bound upper bound p-value
## intercept 0.22 0.0430 0.373 NA
## X1[t] 0.00 0.0000 0.000 0.22
## X2[t] 0.00 0.0000 0.000 0.36
## Y0[t-1] 0.01 -0.1340 0.319 NA
## X1[t-1] 0.65 -0.2700 0.794 NA
## X2[t-1] 0.28 -0.2700 0.796 NA
## Y0[t-2] 0.07 -0.1060 0.758 NA
## X1[t-2] -0.06 -0.2250 0.404 NA
## X2[t-2] -0.08 -0.2260 0.220 NA
## Y0[t-3] -0.08 -0.2540 0.142 NA
## X1[t-3] 0.01 -0.2560 0.177 NA
## X2[t-3] -0.03 -0.2570 0.177 NA
## Y0[t-4] -0.15 -0.3200 0.175 NA
## X1[t-4] 0.03 -0.3200 0.201 NA
## X2[t-4] 0.18 -0.2720 0.314 NA
## Y0[t-5] 0.02 -0.1530 0.314 NA
## X1[t-5] 0.02 -0.1530 0.287 NA
## X2[t-5] 0.04 -0.1810 0.191 NA
## Y0[t-6] -0.09 -0.2600 0.174 NA
## X1[t-6] 0.09 -0.2600 0.261 NA
## X2[t-6] -0.08 -0.2090 0.261 NA
## Y0[t-7] -0.09 -0.2540 0.209 NA
## X1[t-7] 0.17 -0.2540 0.344 NA
## X2[t-7] 0.03 -0.1050 0.344 NA
## Y0[t-8] -0.10 -0.2440 0.160 NA
## X1[t-8] 0.03 -0.2440 0.168 NA
## X2[t-8] 0.10 -0.1040 0.227 NA
##
## Confidence intervals are only approximate due to post selection (pknown is FALSE).
From the above results we can see that seqICP was not able to find the correct set of causal parents for any of the equations. This result actually makes sense since our original DGP is stationary, and thus violates one of the assumptions of seqICP.
To solve for the stationarity issue we had before, we will try to artificially introduce a variance shift into the VAR(1) process we defined before.
To do so, recall that \(e \in \{a,b\}\) are two different environments. We define the following DGP for the variance shifted VAR(1) process:
\[ x^a_{1,t} = 0.2x^a_{1,t-1} + 0.7x^a_{3,t-1} + N(0, 1)\\ x^a_{2,t} = 0.2x^a_{2,t-1} + N(0, 1)\\ x^a_{3,t} = 0.7x^a_{1,t-1} + 0.2x^a_{2,t-1} + N(0, 1)\\ \]
\[ x^b_{1,t} = 0.2x^b_{1,t-1} + 0.7x^b_{3,t-1} + N(0, 1)\\ x^b_{2,t} = 0.2x^b_{2,t-1} + N(0, 1)\\ x^b_{3,t} = 0.7x^b_{1,t-1} + 0.2x^a_{2,t-1} + N(0, 5)\\ \]
Thus, the SCM of \(x_{1,t}\) is the same regardless of \(e\) but there is a variance shift between the environments caused by a change in distribution in \(x_{3,t-1}\).
Below we will sample from the above model:
set.seed(1)
n <- na + nb
x1 <- matrix(data = NA, nrow = n, ncol = 1)
x2 <- matrix(data = NA, nrow = n, ncol = 1)
x3 <- matrix(data = NA, nrow = n, ncol = 1)
for (i in 1:n){
# initialize all series
if (i == 1){
x1[i,] <- rnorm(n = 1, mean = 0, sd = 1)
x2[i,] <- rnorm(n = 1, mean = 0, sd = 1)
x3[i,] <- rnorm(n = 1, mean = 0, sd = 1)
next
}
# sample time series iteratively conditioned on each environment
if (i <= na){
x1[i,] <- 0.2*x1[i-1,] + 0.7*x3[i-1,] + rnorm(n = 1, mean = 0, sd = 1)
x2[i,] <- 0.2*x2[i-1,] + rnorm(n = 1, mean = 0, sd = 1)
x3[i,] <- 0.7*x1[i-1,] + 0.2*x2[i-1,] + rnorm(n = 1, mean = 0, sd = 1)
}
else{
x1[i,] <- 0.2*x1[i-1,] + 0.7*x3[i-1,] + rnorm(n = 1, mean = 0, sd = 1)
x2[i,] <- 0.2*x2[i-1,] + rnorm(n = 1, mean = 0, sd = 1)
x3[i,] <- 0.7*x1[i-1,] + 0.2*x2[i-1,] + rnorm(n = 1, mean = 0, sd = 3)
}
}
var_sim_shifted <- cbind(x1, x2, x3)
ts.plot(var_sim_shifted, type="l", col=c(1, 2, 3))
for (i in 1:dim(var_sim_shifted)[2]){
print(paste0("x", i))
ggtsdisplay(var_sim_shifted[, i])
}
## [1] "x1"
## [1] "x2"
## [1] "x3"
and the autocorrelation of the square of the series
for (i in 1:dim(var_sim_shifted)[2]){
print(paste0("x", i))
ggtsdisplay(var_sim_shifted[, i]^2)
}
## [1] "x1"
## [1] "x2"
## [1] "x3"
From the above ACF and PACFs we can see that the stationarity property was broken by introducing the shift in variance on \(x_{3,t}\).
Finally, we run seqICP on the new dataset:
var_sim_shifted_df <- var_sim_shifted %>% as.data.table()
colnames(var_sim_shifted_df) <- c("x1", "x2", "x3")
varnames <- colnames(var_sim_shifted_df)
Ss <- list()
Sdetails <- list()
for (vn in varnames){
X <- var_sim_shifted_df %>% dplyr::select(-one_of(vn))
y <- var_sim_df %>% dplyr::select(one_of(vn))
output <- seqICP(X=X, Y=y, model = "ar", stopIfEmpty = FALSE, silent = TRUE, )
Sdetails[[vn]] <- output
print(paste0("Target: ", vn, " Features: ", paste(colnames(X), collapse = " ")))
summary(output)
}
## Warning in seqICP.s(X, Y, S[[1]], test, par.test, model, par.model): grid
## points larger than 213 are removed, to account for lag p=7
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 213 are removed, to account for lag p=7
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 213 are removed, to account for lag p=7
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 213 are removed, to account for lag p=7
## [1] "Target: x1 Features: x2 x3"
##
## Invariant Linear Causal Regression at level 0.05
## No variable shows a significant causal effect
##
## coefficient lower bound upper bound p-value
## intercept 0.10 -0.05900 0.2572 NA
## X1[t] 0.00 0.00000 0.0000 1
## X2[t] 0.00 0.00000 0.0000 1
## Y0[t-1] 0.29 -0.11800 0.4241 NA
## X1[t-1] -0.04 -0.19600 0.4247 NA
## X2[t-1] 0.03 -0.21300 0.4244 NA
## Y0[t-2] 0.40 -0.21500 0.5382 NA
## X1[t-2] 0.00 -0.16100 0.5440 NA
## X2[t-2] 0.02 -0.16700 0.5443 NA
## Y0[t-3] -0.05 -0.20300 0.1621 NA
## X1[t-3] -0.08 -0.24000 0.1006 NA
## X2[t-3] 0.04 -0.24300 0.1105 NA
## Y0[t-4] 0.05 -0.24300 0.2016 NA
## X1[t-4] -0.08 -0.24300 0.2032 NA
## X2[t-4] -0.01 -0.24600 0.2018 NA
## Y0[t-5] -0.06 -0.24700 0.0872 NA
## X1[t-5] 0.14 -0.22100 0.3035 NA
## X2[t-5] -0.04 -0.22600 0.3071 NA
## Y0[t-6] -0.10 -0.23600 0.3057 NA
## X1[t-6] -0.18 -0.34900 0.0491 NA
## X2[t-6] 0.01 -0.34800 0.0738 NA
## Y0[t-7] 0.15 -0.34400 0.2834 NA
## X1[t-7] -0.11 -0.27200 0.2885 NA
## X2[t-7] 0.04 -0.27600 0.2894 NA
##
## Confidence intervals are only approximate due to post selection (pknown is FALSE).
##
## [1] "Target: x2 Features: x1 x3"
##
## Invariant Linear Causal Regression at level 0.05
## No variable shows a significant causal effect
##
## coefficient lower bound upper bound p-value
## intercept 0.04 -0.1060 0.179 NA
## X1[t] 0.00 0.0000 0.000 0.48
## X2[t] 0.00 0.0000 0.000 0.44
##
## Confidence intervals are only approximate due to post selection (pknown is FALSE).
## Warning in seqICP.s(X, Y, S[[1]], test, par.test, model, par.model): grid
## points larger than 214 are removed, to account for lag p=6
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 214 are removed, to account for lag p=6
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 214 are removed, to account for lag p=6
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 214 are removed, to account for lag p=6
## [1] "Target: x3 Features: x1 x2"
##
## Invariant Linear Causal Regression at level 0.05
## No variable shows a significant causal effect
##
## coefficient lower bound upper bound p-value
## intercept 0.17 -0.00100 0.3401 NA
## X1[t] 0.00 0.00000 0.0000 1
## X2[t] 0.00 0.00000 0.0000 1
## Y0[t-1] 0.11 -0.09800 0.2444 NA
## X1[t-1] 0.07 -0.09800 0.2461 NA
## X2[t-1] 0.04 -0.12300 0.2455 NA
## Y0[t-2] 0.43 -0.14600 0.5670 NA
## X1[t-2] 0.08 -0.14300 0.5772 NA
## X2[t-2] -0.02 -0.18500 0.5791 NA
## Y0[t-3] 0.02 -0.19100 0.1764 NA
## X1[t-3] -0.06 -0.19100 0.1778 NA
## X2[t-3] -0.17 -0.33900 0.1786 NA
## Y0[t-4] 0.00 -0.34100 0.1539 NA
## X1[t-4] -0.05 -0.34100 0.1524 NA
## X2[t-4] 0.05 -0.15200 0.2208 NA
## Y0[t-5] 0.01 -0.14400 0.2180 NA
## X1[t-5] 0.04 -0.14600 0.2134 NA
## X2[t-5] 0.04 -0.14700 0.2103 NA
## Y0[t-6] -0.05 -0.19000 0.2082 NA
## X1[t-6] -0.05 -0.19200 0.2045 NA
## X2[t-6] -0.17 -0.33300 0.0885 NA
##
## Confidence intervals are only approximate due to post selection (pknown is FALSE).
As we can see, seqICP was not able to identify the invariant set correctly for the variance shifted VAR(1) process.